Clustering and OCCC Approaches in Document Re-ranking
Abstract
In this paper, we describe our approach to the information retrieval for question answering (IR4QA) task of NTCIR-8. To improve retrieval performance, we focus mainly on document re-ranking, a step that sits between initial retrieval and query expansion. We employ two re-ranking approaches. The first is based on entropy clustering, an unsupervised learning technique: relevant documents among the top initial retrieval results are automatically clustered into the same class according to their information entropy values. This continues our previous work. The second is the One Class Co-Clustering (OCCC) approach, which aims to detect topical terms and compute each document's topicality score. The method is simple and performs well. The experimental results show that using the two approaches, clustering and OCCC, in document re-ranking can improve information retrieval performance.
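The abstract does not give the OCCC formulation, so the following is only a rough sketch of the general idea it names: pick topical terms from the top retrieved documents, score each document by how topical it is, and re-rank by that score. It is not the authors' method. The function names (`pick_topical_terms`, `topicality`, `rerank`), the use of a background document set, and the KL-style term weighting are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use proper segmentation.
    return text.lower().split()


def pick_topical_terms(top_docs, background_docs, k=3):
    """Return the k terms most over-represented in top_docs (assumed heuristic)."""
    top = Counter(t for d in top_docs for t in tokenize(d))
    bg = Counter(t for d in background_docs for t in tokenize(d))
    top_total = sum(top.values())
    bg_total = sum(bg.values())
    vocab_size = len(set(top) | set(bg))

    def contribution(term):
        # Per-term contribution to KL(top || background), add-one smoothed.
        p_top = top[term] / top_total
        p_bg = (bg[term] + 1) / (bg_total + vocab_size)
        return p_top * math.log(p_top / p_bg)

    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(top, key=lambda t: (-contribution(t), t))
    return set(ranked[:k])


def topicality(doc, terms):
    """Fraction of a document's tokens that are topical terms (assumed score)."""
    tokens = tokenize(doc)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in terms) / len(tokens)


def rerank(top_docs, background_docs, k=3):
    """Re-order the initially retrieved documents by descending topicality."""
    terms = pick_topical_terms(top_docs, background_docs, k)
    return sorted(top_docs, key=lambda d: topicality(d, terms), reverse=True)


# Toy data, invented purely for illustration.
initial = [
    "weather report today",
    "solar panel energy output",
    "solar energy panel cost solar",
]
background = ["weather news today report", "sports news today"]
reranked = rerank(initial, background)
```

In this toy run the off-topic weather document drops to the bottom of the list. In a pipeline like the one the abstract describes, the re-ranked list would then feed query expansion.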
Similar resources
THUIR at TREC 2008: Relevance Feedback Track
The Tsinghua University Information Retrieval Group (THUIR) participated in the first Relevance Feedback Track of TREC 2008. The TMiner search engine was used as our text retrieval system, because its processing capability and flexibility on large text data have been proven over many years of Web Track and Terabyte Track participation. In the track, we studied two approaches: 1) query ...
THUIR at NTCIR-10 INTENT-2 Task
This paper describes our approaches and results in the NTCIR-10 INTENT-2 task. This year, we participate in the subtasks for both the Chinese and English topics. We extract subtopics from multiple resources for these topics, and several subtopic clustering and re-ranking methods are proposed in this work. In the Document Ranking subtask, we redefine the novelty of a document and use the new definition to...
Circular Re-ranking for Visual Search
Conventional approaches to visual search re-ranking empirically take the “classification performance” as the optimization objective, in which each visual document is judged relevant or not, followed by a process of moving relevant documents up the ranking. We first show that the classification performance fails to produce a globally optimal ranked list, and then formulate re-ranking as an op...
Document Re-ranking via Wikipedia Articles for Definition/Biography Type Questions
In this paper, we propose a document re-ranking approach that uses the Wikipedia articles related to specific questions to re-order the initially retrieved documents, improving the precision of the top retrieved documents in a Chinese information retrieval for question answering (IR4QA) system where the questions are of the definition or biography type. On one hand, we compute the similarity between each d...
Publication date: 2010